Examples

These examples demonstrate most of the functionality of the package, its typical usage, and how to make some plots you might want to use.

The examples:

  - An Ising Model
  - Clustered Binary Data (Small $n$)
  - Spatial Binary Regression

An Ising Model

The term "Ising model" is usually used to refer to a Markov random field of dichotomous random variables on a regular lattice. The graph is such that each variable shares an edge only with its nearest neighbors in each dimension. It's a traditional model for magnetic spins, where the coding $(-1,1)$ is usually used. There's one parameter per vertex (a "local magnetic field") that increases or decreases the chance of getting a $+1$ state at that vertex; and there's a single pairwise parameter that controls the strength of interaction between neighbor states.

In our terminology it's just an autologistic model with the appropriate graph. Specifically, it's an ALsimple model: one with a FullUnary unary parameter and a SimplePairwise pairwise parameter.

We can create such a model once we have the graph. For example, let's create two 30-by-30 lattices: one without any special handling of the boundary, and one with periodic boundary conditions. This can be done with LightGraphs.jl's Grid function.

n = 30  # number of rows and columns
using LightGraphs
G1 = Grid([n, n], periodic=false)
G2 = Grid([n, n], periodic=true)
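
As a quick check (a minimal sketch using LightGraphs' nv and ne functions), the two graphs have the same vertices, but the periodic lattice has extra edges that wrap across the boundary:

nv(G1), ne(G1)   # expected (900, 1740) for the regular lattice
nv(G2), ne(G2)   # expected (900, 1800); the wrap-around adds 60 edges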

Now create an AL model for each case. Initialize the unary parameters to Gaussian white noise. By default the pairwise parameter is set to zero, which implies independence of the variables. Use the same parameters for the two models, so the only difference between them is the graph.

using Random, Autologistic
Random.seed!(8888)
α = randn(n^2)
M1 = ALsimple(G1, α)
M2 = ALsimple(G2, α)

Typing M2 at the REPL shows information about the model. It's an ALsimple type with one observation of length 900.

julia> M2
Autologistic model of type ALsimple with parameter vector [α; λ].
Fields:
  responses    900×1 Bool array
  unary        900×1 FullUnary with fields:
                 α  900×1 array (unary parameter values)
  pairwise     900×900×1 SimplePairwise with fields:
                 λ      [0.0] (association parameter)
                 G      the graph (900 vertices, 1800 edges)
                 count  1 (the number of observations)
                 A      900×900 SparseMatrixCSC (the adjacency matrix)
  centering    none
  coding       (-1, 1)
  labels       ("low", "high")
  coordinates  900-element vector of Tuple{Float64,Float64}

The conditionalprobabilities function returns the probability of observing a $+1$ state at each vertex, conditional on the vertex's neighbor values. These can be visualized as an image, using a heatmap (from Plots.jl):

using Plots
condprobs = conditionalprobabilities(M1)
hm = heatmap(reshape(condprobs, n, n), c=:grays, aspect_ratio=1,
             title="probability of +1 under independence")
plot(hm)
[Figure: heatmap titled "probability of +1 under independence"]

Since the association parameter is zero, there are no neighborhood effects. The above conditional probabilities are equal to the marginal probabilities.
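
To make this concrete, here is a minimal check of the returned values. It assumes the usual parametrization of the symmetric autologistic model, under which, with $\lambda = 0$ and coding $(-1,1)$, the probability of the high state at vertex $i$ is $e^{\alpha_i}/(e^{-\alpha_i} + e^{\alpha_i}) = 1/(1 + e^{-2\alpha_i})$:

maximum(abs.(vec(condprobs) .- 1 ./ (1 .+ exp.(-2α))))   # should be ≈ 0 under this assumption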

Next, set the association parameter of both models to 0.75, a fairly strong level of association, to introduce a neighbor effect.

setpairwiseparameters!(M1, [0.75])
setpairwiseparameters!(M2, [0.75])

A quick way to see the effect of this parameter is to observe random samples from the models, which can be drawn with the sample function. For this example, draw perfect samples using a bounding chain algorithm.

s1 = sample(M1, method=perfect_bounding_chain)
s2 = sample(M2, method=perfect_bounding_chain)

Other options are available for sampling. The enumeration SamplingMethods lists them. The samples we have just drawn can also be visualized using heatmap:

pl1 = heatmap(reshape(s1, n, n), c=:grays, colorbar=false, title="regular boundary");
pl2 = heatmap(reshape(s2, n, n), c=:grays, colorbar=false, title="periodic boundary");
plot(pl1, pl2, size=(800,400), aspect_ratio=1)
[Figure: two sample heatmaps titled "regular boundary" and "periodic boundary"]

In these plots, black indicates the low state, and white the high state. A lot of local clustering is occurring in the samples due to the neighbor effects.
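
As noted above, perfect_bounding_chain is just one member of the SamplingMethods enumeration. Assuming it is a standard Julia @enum (an assumption, not something shown above), all of its members can be listed at the REPL with the built-in instances function:

instances(SamplingMethods)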

To see the long-run differences between the two models, we can look at the marginal probabilities. They can be estimated by drawing many samples and averaging them (note that running this code chunk can take a couple of minutes):

marg1 = sample(M1, 500, method=perfect_bounding_chain, verbose=true, average=true)
marg2 = sample(M2, 500, method=perfect_bounding_chain, verbose=true, average=true)
pl3 = heatmap(reshape(marg1, n, n), c=:grays,
              colorbar=false, title="regular boundary");
pl4 = heatmap(reshape(marg2, n, n), c=:grays,
              colorbar=false, title="periodic boundary");
plot(pl3, pl4, size=(800,400), aspect_ratio=1)
savefig("marginal-probs.png")

The figure marginal-probs.png looks like this:

[Figure: marginal-probs.png]

Although the differences between the two marginal distributions are not striking, the extra edges connecting top/bottom and left/right do have some influence on the probabilities at the periphery of the square.
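
One way to put a number on this boundary effect (a hypothetical summary reusing marg1 and marg2 from above; nothing here is part of the package API) is to average the estimated marginal probabilities over the outermost ring of cells for each model:

using Statistics
P1, P2 = reshape(marg1, n, n), reshape(marg2, n, n)
edge_cells(P) = [P[1,:]; P[end,:]; P[2:end-1,1]; P[2:end-1,end]]   # the outermost ring of the lattice
mean(edge_cells(P1)), mean(edge_cells(P2))   # average boundary probability, regular vs. periodic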

As a final demonstration, perform Gibbs sampling for model M2, starting from a random state. Display a gif animation of the progress.

nframes = 150
gibbs_steps = sample(M2, nframes, method=Gibbs)
anim = @animate for i =  1:nframes
    heatmap(reshape(gibbs_steps[:,i], n, n), c=:grays, colorbar=false, 
            aspect_ratio=1, title="Gibbs sampling: step $(i)")
end
gif(anim, "ising_gif.gif", fps=10)

[Animation: ising_gif.gif, Gibbs sampling progress]

Clustered Binary Data (Small $n$)

The retinitis pigmentosa data set (obtained from this source) is an ophthalmology data set. The data come from 444 patients who had both eyes examined. The data can be loaded with Autologistic.datasets:

julia> using Autologistic, DataFrames, LightGraphs

julia> df = Autologistic.datasets("pigmentosa");

julia> first(df, 6)
6×9 DataFrame. Omitted printing of 2 columns
│ Row │ ID     │ aut_dom │ aut_rec │ sex_link │ age    │ sex    │ psc    │
│     │ Int64⍰ │ Int64⍰  │ Int64⍰  │ Int64⍰   │ Int64⍰ │ Int64⍰ │ Int64⍰ │
├─────┼────────┼─────────┼─────────┼──────────┼────────┼────────┼────────┤
│ 1   │ 7      │ 0       │ 0       │ 0        │ 27     │ 0      │ 1      │
│ 2   │ 7      │ 0       │ 0       │ 0        │ 27     │ 0      │ 1      │
│ 3   │ 8      │ 0       │ 1       │ 0        │ 44     │ 1      │ 0      │
│ 4   │ 8      │ 0       │ 1       │ 0        │ 44     │ 1      │ 0      │
│ 5   │ 19     │ 0       │ 0       │ 0        │ 24     │ 0      │ 1      │
│ 6   │ 19     │ 0       │ 0       │ 0        │ 24     │ 0      │ 1      │

julia> describe(df)
9×8 DataFrame. Omitted printing of 1 columns
│ Row │ variable │ mean      │ min   │ median  │ max   │ nunique │ nmissing │
│     │ Symbol   │ Float64   │ Int64 │ Float64 │ Int64 │ Nothing │ Int64    │
├─────┼──────────┼───────────┼───────┼─────────┼───────┼─────────┼──────────┤
│ 1   │ ID       │ 1569.22   │ 7     │ 1466.5  │ 3392  │         │ 0        │
│ 2   │ aut_dom  │ 0.0990991 │ 0     │ 0.0     │ 1     │         │ 0        │
│ 3   │ aut_rec  │ 0.141892  │ 0     │ 0.0     │ 1     │         │ 0        │
│ 4   │ sex_link │ 0.0608108 │ 0     │ 0.0     │ 1     │         │ 0        │
│ 5   │ age      │ 34.2342   │ 6     │ 32.5    │ 80    │         │ 0        │
│ 6   │ sex      │ 0.576577  │ 0     │ 1.0     │ 1     │         │ 0        │
│ 7   │ psc      │ 0.432432  │ 0     │ 0.0     │ 1     │         │ 0        │
│ 8   │ eye      │ 0.5       │ 0     │ 0.5     │ 1     │         │ 0        │
│ 9   │ va       │ 0.504505  │ 0     │ 1.0     │ 1     │         │ 0        │

The response for each eye is va, an indicator of poor visual acuity (coded 0 = no, 1 = yes in the data set). Seven covariates were also recorded for each eye: aut_dom, aut_rec, sex_link, age, sex, psc, and eye (the remaining columns of the data frame).

The last four factors are relevant clinical observations, and the first three are genetic factors. The data set also includes an ID column with an ID number specific to each patient. Eyes with the same ID come from the same person.

The natural unit of analysis is the eye, but pairs of observations from the same patient are "clustered" because the occurrence of acuity loss in the left and right eye is probably correlated. We can model each person's two va outcomes as two dichotomous random variables with a 2-vertex, 1-edge graph.

julia> G = Graph(2,1)
{2, 1} undirected simple Int64 graph

Each of the 444 bivariate observations has this graph, and each has its own set of covariates.

If we include all seven predictors, plus intercept, in our model, we have 2 variables per observation, 8 predictors, and 444 observations.

Before creating the model we need to re-structure the covariates. The data in df has one row per eye, with the variable ID indicating which eyes belong to the same patient. We need to rearrange the responses (Y) and the predictors (X) into arrays suitable for our autologistic models: Y becomes a 2 × 444 array and X a 2 × 8 × 444 array (variables × predictors × observations).

X = Array{Float64,3}(undef, 2, 8, 444);   # 2 eyes × 8 predictors × 444 patients
Y = Array{Float64,2}(undef, 2, 444);      # 2 eyes × 444 patients
for i in 1:2:888
    patient = Int((i+1)/2)                                   # rows i and i+1 are the same patient's eyes
    X[1,:,patient] = [1 permutedims(Vector(df[i,2:8]))]      # intercept + 7 covariates, first eye
    X[2,:,patient] = [1 permutedims(Vector(df[i+1,2:8]))]    # intercept + 7 covariates, second eye
    Y[:,patient] = convert(Array, df[i:i+1, 9])              # the two va responses
end

For example, patient 100 had responses

julia> Y[:,100]
2-element Array{Float64,1}:
 1.0
 0.0

These responses indicate visual acuity loss in the left eye, but not in the right. The predictors for this individual are:

julia> X[:,:,100]
2×8 Array{Float64,2}:
 1.0  0.0  0.0  0.0  18.0  1.0  1.0  0.0
 1.0  0.0  0.0  0.0  18.0  1.0  1.0  1.0

Now we can create our autologistic regression model.

model = ALRsimple(G, X, Y=Y)
Autologistic regression model of type ALRsimple with parameter vector [β; λ].
Fields:
  responses    2×444 Bool array
  unary        2×444 LinPredUnary with fields:
                 X  2×8×444 array (covariates)
                 β  8-element vector (regression coefficients)
  pairwise     2×2×444 SimplePairwise with fields:
                 λ      [0.0] (association parameter)
                 G      the graph (2 vertices, 1 edges)
                 count  444 (the number of observations)
                 A      2×2 SparseMatrixCSC (the adjacency matrix)
  centering    none
  coding       (-1, 1)
  labels       ("low", "high")
  coordinates  2-element vector of Tuple{Float64,Float64}

This creates a model with the "simple pairwise" structure, using a single association parameter. The default is to use no centering adjustment, and to use coding $(-1,1)$ for the responses. This "symmetric" version of the model is recommended for a variety of reasons. Using different coding or centering choices is only recommended if you have a thorough understanding of what you are doing; but if you wish to use different choices, this can easily be done using keyword arguments. For example, ALRsimple(G, X, Y=Y, coding=(0,1), centering=expectation) creates the "centered autologistic model" that has appeared in the literature (e.g., here and here).

The model has nine parameters (eight regression coefficients plus the association parameter). All parameters are initialized to zero:

julia> getparameters(model)
9-element Array{Float64,1}:
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0
 0.0

When we call getparameters, the vector returned always has the unary parameters first, with the pairwise parameter(s) appended at the end.
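
The unary and pairwise parts can also be retrieved separately, using the accessor functions that appear again in the spatial example below (a minimal sketch):

getunaryparameters(model)      # the 8 regression coefficients β
getpairwiseparameters(model)   # the association parameter λ, as a 1-element vector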

Because there are only two vertices in the graph, we can use the full likelihood (fit_ml! function) to do parameter estimation. This function returns a structure with the estimates as well as standard errors, p-values, and 95% confidence intervals for the parameter estimates.

out = fit_ml!(model)
Autologistic model fitting results. Its non-empty fields are:
  estimate       9-element vector of parameter estimates
  se             9-element vector of standard errors
  pvalues        9-element vector of 2-sided p-values
  CIs            9-element vector of 95% confidence intervals (as tuples)
  optim          the output of the call to optimize()
  Hinv           the inverse of the Hessian, evaluated at the optimum
  kwargs         extra keyword arguments passed to sample()
  convergence    true
Use summary(fit; [parnames, sigdigits]) to see a table of estimates.
For pseudolikelihood, use oneboot() and addboot!() to add bootstrap after the fact.

To view the estimation results, use summary:

summary(out, parnames = ["icept", "aut_dom", "aut_rec", "sex_link", "age", "sex",
        "psc", "eye", "λ"])
name       est        se        p-value   95% CI
icept      -0.264     0.0973    0.00667    (-0.455, -0.0732)
aut_dom    -0.167     0.0932    0.0734     (-0.349, 0.0158)
aut_rec     0.104     0.0787    0.188     (-0.0507, 0.258)
sex_link    0.301     0.126     0.0168     (0.0542, 0.547)
age         0.00492   0.00184   0.00763   (0.00131, 0.00853)
sex         0.0207    0.0557    0.711     (-0.0885, 0.13)
psc         0.139     0.0589    0.0185     (0.0233, 0.254)
eye         0.0273    0.12      0.819      (-0.207, 0.262)
λ           0.818     0.0656    0.0          (0.69, 0.947)

From this we see that the association parameter is fairly large (0.818), supporting the idea that the left and right eyes are associated. It is also highly statistically significant. Among the covariates, sex_link, age, and psc are all statistically significant.
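
If you want these numbers programmatically rather than as a printed table, the fields listed in the fitting results can be accessed directly (a minimal sketch using the field names shown above):

λ_hat = out.estimate[end]   # the association parameter estimate (last element, ≈ 0.818)
out.CIs[end]                # its 95% confidence interval, as a tuple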

Spatial Binary Regression

ALR models are natural candidates for analysis of spatial binary data, where locations in the same neighborhood are more likely to have the same outcome than sites that are far apart. The hydrocotyle data provide a typical example. The response in this data set is the presence/absence of a certain plant species in a grid of 2995 regions covering Germany. The data set is included in Autologistic.jl:

julia> using Autologistic, DataFrames, LightGraphs

julia> df = Autologistic.datasets("hydrocotyle")
2995×5 DataFrame
│ Row  │ X      │ Y      │ altitude │ temperature │ obs    │
│      │ Int64⍰ │ Int64⍰ │ Float64⍰ │ Float64⍰    │ Int64⍰ │
├──────┼────────┼────────┼──────────┼─────────────┼────────┤
│ 1    │ 16     │ 1      │ 1.71139  │ 8.2665      │ 1      │
│ 2    │ 15     │ 2      │ 1.8602   │ 8.15482     │ 1      │
│ 3    │ 16     │ 2      │ 1.7858   │ 8.09898     │ 1      │
│ 4    │ 17     │ 2      │ 1.7858   │ 8.09898     │ 0      │
│ 5    │ 18     │ 2      │ 1.7858   │ 8.09898     │ 0      │
│ 6    │ 19     │ 2      │ 1.8602   │ 8.09898     │ 1      │
│ 7    │ 15     │ 3      │ 1.8602   │ 8.15482     │ 1      │
⋮
│ 2988 │ 44     │ 77     │ 17.7836  │ 1.17513     │ 0      │
│ 2989 │ 26     │ 78     │ 16.4442  │ 2.6269      │ 0      │
│ 2990 │ 27     │ 78     │ 15.3281  │ 3.5203      │ 0      │
│ 2991 │ 28     │ 78     │ 17.6347  │ 1.90102     │ 0      │
│ 2992 │ 31     │ 78     │ 16.3698  │ -0.5        │ 0      │
│ 2993 │ 32     │ 78     │ 15.4025  │ -0.5        │ 0      │
│ 2994 │ 33     │ 78     │ 16.3698  │ 0.281726    │ 0      │
│ 2995 │ 27     │ 79     │ 18.23    │ 1.95685     │ 0      │

In the data frame, the variables X and Y give the spatial coordinates of each region (in dimensionless integer units), obs gives the presence/absence data (1 = presence), and altitude and temperature are covariates.

We will use an ALRsimple model for these data. The graph can be formed using makespatialgraph:

locations = [(df.X[i], df.Y[i]) for i in 1:size(df,1)]
g = makespatialgraph(locations, 1.0)

makespatialgraph creates the graph by adding edges between any vertices with Euclidean distance smaller than a cutoff distance (LightGraphs.jl has a euclidean_graph function that does the same thing). For these data, arranged on a grid, a threshold of 1.0 will make a 4-nearest-neighbors lattice; letting the threshold be sqrt(2) would make an 8-nearest-neighbors lattice.
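
The graph itself is stored in the G field of the object returned by makespatialgraph (it is used that way in the plotting code below). A quick size check (a minimal sketch):

using LightGraphs
nv(g.G), ne(g.G)   # expected (2995, 5804) for the 4-nearest-neighbors lattice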

We can visualize the graph, the responses, and the predictors using GraphRecipes.jl (there are several other options for plotting graphs as well).

using GraphRecipes, Plots

# Function to convert a value to a gray shade
makegray(x, lo, hi) = RGB([(x-lo)/(hi-lo) for i=1:3]...)

# Function to plot the graph with node shading determined by v.
# Plot each node as a square and don't show the edges.
function myplot(v, lohi=nothing)
    if lohi==nothing
        colors = makegray.(v, minimum(v), maximum(v))
    else
        colors = makegray.(v, lohi[1], lohi[2])
    end
    return graphplot(g.G, x=df.X, y=df.Y, background_color = :lightblue,
                marker = :square, markersize=2, markerstrokewidth=0,
                markercolor = colors, yflip = true, linecolor=nothing)
end

# Make the plot
plot(myplot(df.obs), myplot(df.altitude), myplot(df.temperature),
     layout=(1,3), size=(800,300), titlefontsize=8,
     title=hcat("Species Presence (white = yes)", "Altitude (lighter = higher)",
                "Temperature (lighter = higher)"))
[Figure: three panels titled "Species Presence (white = yes)", "Altitude (lighter = higher)", and "Temperature (lighter = higher)"]

Constructing the model

We can see that the species is found primarily at low-altitude locations. To model the effect of altitude and temperature on species presence, construct an ALRsimple model.

# Autologistic.jl requires predictors to be a matrix of Float64
Xmatrix = Array{Float64}([ones(2995) df.altitude df.temperature])

# Create the model
hydro = ALRsimple(g.G, Xmatrix, Y=df.obs)
Autologistic regression model of type ALRsimple with parameter vector [β; λ].
Fields:
  responses    2995×1 Bool array
  unary        2995×1 LinPredUnary with fields:
                 X  2995×3×1 array (covariates)
                 β  3-element vector (regression coefficients)
  pairwise     2995×2995×1 SimplePairwise with fields:
                 λ      [0.0] (association parameter)
                 G      the graph (2995 vertices, 5804 edges)
                 count  1 (the number of observations)
                 A      2995×2995 SparseMatrixCSC (the adjacency matrix)
  centering    none
  coding       (-1, 1)
  labels       ("low", "high")
  coordinates  2995-element vector of Tuple{Float64,Float64}

The model hydro has four parameters: three regression coefficients (intercept, altitude, and temperature) plus an association parameter. It is a "symmetric" autologistic model, because it has a coding symmetric around zero and no centering term.
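
As in the previous example, the parameters should all be initialized to zero before fitting; they can be inspected with getparameters (unary coefficients first, association parameter last):

getparameters(hydro)   # 4-element vector [β₁, β₂, β₃, λ], expected to be all zeros here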

Fitting the model by pseudolikelihood

With 2995 nodes in the graph, the likelihood is intractable for this case. Use fit_pl! to do parameter estimation by pseudolikelihood instead. The fitting function uses the BFGS algorithm via Optim.jl. Any of Optim's general options can be passed to fit_pl! to control the optimization. We have found that allow_f_increases often aids convergence. It is used here:

julia> fit1 = fit_pl!(hydro, allow_f_increases=true)
Autologistic model fitting results. Its non-empty fields are:
  estimate       4-element vector of parameter estimates
  optim          the output of the call to optimize()
  kwargs         extra keyword arguments passed to sample()
  convergence    true
Use summary(fit; [parnames, sigdigits]) to see a table of estimates.
For pseudolikelihood, use oneboot() and addboot!() to add bootstrap after the fact.

julia> parnames = ["intercept", "altitude", "temperature", "association"];

julia> summary(fit1, parnames=parnames)
name          est       se   p-value   95% CI
intercept     -0.192                         
altitude      -0.0573                        
temperature    0.0498                        
association    0.361

fit_pl! mutates the model object by setting its parameters to the optimal values. It also returns an object, of type ALfit, which holds information about the result. Calling summary(fit1) produces a summary table of the estimates. For now there are no standard errors. This will be addressed below.
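
A quick way to confirm that the mutated model and the fit object agree (a minimal sketch):

getparameters(hydro) ≈ fit1.estimate   # expected to be true after fitting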

To quickly visualize the quality of the fitted model, we can use sampling to get the marginal probabilities, and to observe specific samples.

# Average 500 samples to estimate marginal probability of species presence
marginal1 = sample(hydro, 500, method=perfect_bounding_chain, average=true)

# Draw 2 random samples for visualizing generated data.
draws = sample(hydro, 2, method=perfect_bounding_chain)

# Plot them
plot(myplot(marginal1, (0,1)), myplot(draws[:,1]), myplot(draws[:,2]),
     layout=(1,3), size=(800,300), titlefontsize=8,
     title=["Marginal Probability" "Random sample 1" "Random Sample 2"])
[Figure: three panels titled "Marginal Probability", "Random sample 1", and "Random Sample 2"]

In the above code, perfect sampling was used to draw samples from the fitted distribution. The marginal plot shows consistency with the observed data, and the two generated data sets show a level of spatial clustering similar to the observed data.
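
A simple numerical companion to this visual check (a hedged sketch; skipmissing guards against the column's missing-capable element type) is to compare the observed prevalence of the species with the average fitted marginal probability:

using Statistics
mean(skipmissing(df.obs)), mean(marginal1)   # observed prevalence vs. mean fitted marginal probability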

Error estimation 1: bootstrap after the fact

A parametric bootstrap can be used to get an estimate of the precision of the estimates returned by fit_pl!. The function oneboot has been included in the package to facilitate this. Each call of oneboot draws a random sample from the fitted distribution, then re-fits the model using this sample as the responses. It returns a named tuple giving the sample, the parameter estimates, and a convergence flag. Any extra keyword arguments are passed on to sample or optimize as appropriate to control the process.

julia> # Do one bootstrap replication for demonstration purposes.
       oneboot(hydro, allow_f_increases=true, method=perfect_bounding_chain)
(sample = [-1.0, 1.0, -1.0, 1.0, 1.0, 1.0, 1.0, -1.0, -1.0, 1.0  …  -1.0, -1.0, -1.0, -1.0, 1.0, -1.0, -1.0, -1.0, -1.0, -1.0], estimate = [-0.398996, -0.0493274, 0.0708516, 0.361303], convergence = true)

An array of the tuples produced by oneboot can be fed to addboot! to update the fitting summary with precision estimates:

nboot = 2000
boots = [oneboot(hydro, allow_f_increases=true, method=perfect_bounding_chain) for i=1:nboot]
addboot!(fit1, boots)

At the time of writing, this took about 5.7 minutes on the author's workstation. After adding the bootstrap information, the fitting results look like this:

julia> summary(fit1,parnames=parnames)
name          est       se       95% CI
intercept     -0.192    0.319     (-0.858, 0.4)
altitude      -0.0573   0.015    (-0.0887, -0.0296)
temperature    0.0498   0.0361   (-0.0163, 0.126)
association    0.361    0.018      (0.326, 0.397)

Confidence intervals for altitude and the association parameter both exclude zero, so we conclude that they are statistically significant.

Error estimation 2: (parallel) bootstrap when fitting

Alternatively, the bootstrap inference procedure can be done at the same time as fitting by providing the keyword argument nboot (which specifies the number of bootstrap samples to generate) when calling fit_pl!. If you do this, and you have more than one worker process available, then the bootstrap will be done in parallel across the workers (using an @distributed for loop). This makes it easy to achieve speed gains from parallelism on multicore workstations.

using Distributed                  # needed for parallel computing
addprocs(6)                        # create 6 worker processes
@everywhere using Autologistic     # workers need the package loaded
fit2 = fit_pl!(hydro, nboot=2000,
               allow_f_increases=true, method=perfect_bounding_chain)

In this case the 2000 bootstrap replications took about 1.1 minutes on the same 6-core workstation. The output object fit2 already includes the confidence intervals:

julia> summary(fit2, parnames=parnames)
name          est       se       95% CI
intercept     -0.192    0.33        (-0.9, 0.407)
altitude      -0.0573   0.0157   (-0.0897, -0.0297)
temperature    0.0498   0.0372   (-0.0169, 0.13)
association    0.361    0.0179     (0.327, 0.396)

For parallel computing of the bootstrap in other settings (e.g., on a cluster), it should be fairly simple to implement in a script, using the oneboot/addboot! approach of the previous section.
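
For instance, a hedged sketch of that approach (assuming worker processes have already been added with addprocs, and that the fitted model hydro and the fit object fit1 are in scope) might look like this:

using Distributed
@everywhere using Autologistic                 # workers need the package loaded
boots = pmap(1:2000) do _
    oneboot(hydro, allow_f_increases=true, method=perfect_bounding_chain)
end                                            # run the replications in parallel
addboot!(fit1, boots)                          # fold them into the fitting summary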

Comparison to logistic regression

If we ignore spatial association, and just fit the model with ordinary logistic regression, we get the following result:

using GLM
LR = glm(@formula(obs ~ altitude + temperature), df, Bernoulli(), LogitLink());
coef(LR)
3-element Array{Float64,1}:
  5.967115773789926
 -0.8273369416876368
 -0.37382566103475834

The logistic regression coefficients are not directly comparable to the ALR coefficients, because the ALR model uses coding $(-1, 1)$. If we want to compare the two models, we can transform the symmetric model to use the $(0, 1)$ coding.

The symmetric ALR model with (0,1) coding

The symmetric ALR model with $(-1, 1)$ coding is equivalent to a model with $(0,1)$ coding and a constant centering adjustment of 0.5. If the original model has coefficients $(β, λ)$, the transformed model has coefficients $(2β, 4λ)$.

To compare the symmetric ALR model to a logistic regression model, either

  1. (recommended) Fit the $(-1,1)$ ALRsimple model and transform the parameters, or

  2. Fit an ALRsimple model with coding=(0,1) and centering=onehalf.

Using option 1 with model hydro, we have

transformed_pars = [2*getunaryparameters(hydro); 4*getpairwiseparameters(hydro)]
4-element Array{Float64,1}:
 -0.3831243934309312
 -0.11460656341507373
  0.09962289682243582
  1.4457110224855951

We see that the association parameter is large (1.45), but the regression parameters are small compared to the logistic regression model. This is typical: ignoring spatial association tends to result in overestimation of the regression effects.

To see that option 2 is also valid, we can fit the transformed model directly:

same_as_hydro = ALRsimple(g.G, Xmatrix, Y=df.obs, coding=(0,1), centering=onehalf)
fit3 = fit_pl!(same_as_hydro, allow_f_increases=true)
fit3.estimate
4-element Array{Float64,1}:
 -0.3831244015693046
 -0.11460656298156281
  0.09962289760534766
  1.4457110228463246

We see that the parameter estimates from same_as_hydro are equal to the hydro estimates after transformation.
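
A direct numerical check of this equivalence (a minimal sketch; the tolerance just absorbs optimizer rounding):

isapprox(fit3.estimate, transformed_pars, atol=1e-6)   # expected to be true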

Comparison to the centered model

The centered autologistic model can be easily constructed for comparison with the symmetric one. We can start with a copy of the symmetric model we have already created.

The pseudolikelihood function for the centered model is not convex. Three different local optima were found. For this demonstration we are using the start argument to let optimization start from a point close to the best minimum found.

centered_hydro = deepcopy(hydro)
centered_hydro.coding = (0,1)
centered_hydro.centering = expectation
fit4 = fit_pl!(centered_hydro, nboot=2000, start=[-1.7, -0.17, 0.0, 1.5],
               allow_f_increases=true, method=perfect_bounding_chain)
julia> summary(fit4, parnames=parnames)
name          est       se       95% CI
intercept     -2.29     1.07       (-4.6, -0.345)
altitude      -0.16     0.0429   (-0.258, -0.088)
temperature    0.0634   0.115    (-0.138, 0.32)
association    1.51     0.0505     (1.42, 1.61)

julia> round.([fit3.estimate fit4.estimate], digits=3)
4×2 Array{Float64,2}:
 -0.383  -2.29
 -0.115  -0.16
  0.1     0.063
  1.446   1.506

The main difference between the symmetric ALR model and the centered one is the intercept, which changes from -0.383 to -2.29 when changing to the centered model. This is not a small difference. To see this, compare what the two models predict in the absence of spatial association.

# Change models to have association parameters equal to zero
# Remember parameters are always Array{Float64,1}.
setpairwiseparameters!(centered_hydro, [0.0])
setpairwiseparameters!(hydro, [0.0])

# Sample to estimate marginal probabilities
centered_marg = sample(centered_hydro, 500, method=perfect_bounding_chain, average=true)
symmetric_marg = sample(hydro, 500, method=perfect_bounding_chain, average=true)

# Plot to compare
plot(myplot(centered_marg, (0,1)), myplot(symmetric_marg, (0,1)),
     layout=(1,2), size=(500,300), titlefontsize=8,
     title=["Centered Model" "Symmetric Model"])

[Figure: noassociation.png, two panels titled "Centered Model" and "Symmetric Model"]

If we remove the spatial association term, the centered model predicts a very low probability of seeing the plant anywhere, including in locations with low elevation, where the plant is plentiful in reality. This illustrates a known problem with the centered model: parameter interpretability is lost when the association becomes strong.